# **Scalable embedded Realtime**

#### with OpenComRTOS

#### **Bernhard H.C. Sputh**

bernhard.sputh@altreonic.com, http://www.altreonic.com

From Deep Space to Deep Sea



Push Button High Reliability

## Outline



- History of Altreonic
- Scalability / Distribution
- OpenComRTOS
- Demonstrations
- Performance
- Conclusions

# **History of Altreonic**



- Eonic (Eric Verhulst): 1989-2001
  - Developed Virtuoso, a parallel RTOS (sold to Wind River Systems);
  - Communicating Sequential Processes as the foundation of the "pragmatic superset of CSP";
- Open License Society: 2004-now
  - R&D on Systems and Software Engineering;
  - Developed OpenComRTOS using Formal Methods;
- Altreonic: 2008-now
  - Commercialises OpenComRTOS;
  - Based in Linden (near Leuven), Belgium.

# Why Scalability is needed



- Building robots / systems out of smart sensors and actuators.
- Central control moves towards distributed control.



# **Scalability / Distribution**



- Application Domains:
  - Multi sensor fusion,
  - Image processing,
  - radar, sonar
- Applications can utilize additional resources.
  - Additional CPU-Cores
  - Additional communication links
- Potential problems of Distributed Control:
  - Design complexity increases
  - Probability of failure increases

# **OpenComRTOS**


- Supported Targets
- OpenComRTOS Designer
- Open Tracer
- Open System Inspector
- Safe Virtual Machine
- Springer book:

#### **Formal Development of a Network-Centric RTOS**

Software Engineering for Reliable Embedded Systems

Verhulst, E., Boute, R.T., Faria, J.M.S., Sputh, B.H.C., Mezhuyev, V.

# **Supported Targets**

- Host Operating Systems:
  - MS-Windows 32
  - POSIX 32 (Linux 2.6 / 3.0)
- Native Support:
  - ARM-Cortex-M3
  - PowerPC e600
  - TI C66x
  - XMOS XS1
- Dormant Ports: Xilinx Microblaze, ESA Leon3, MLX16, NXP CoolFlux

# **OpenComRTOS** Designer



- OpenComRTOS Designer makes it possible to:
  - Use $1$ to $2^{24}$ Nodes (CPU-Cores) in one System.
  - Support heterogeneous systems.
  - Use different communication technologies between Processing-Nodes (RS232, Ethernet, PCIe, RapidIO, etc.)
- Paradigms:
  - Interacting Entities
  - Virtual Single Processor (VSP) Programming Model
  - Distributed Real Time Support

# **OCR Designer Meta Models**



```
<platform type="arm-cortex-m3" variant="arm-cortex-m3" svgPath="chip.svg"
   version="1.5" help="OpenVE::OpenComRTOS::Node">
   <attribute name="name" type="string" unique="node" regexp="[A-Za-z0-9 ]+"/>
   <deviceDriver name="ethernetUip">
     <includeFile name="driver/stellarisEthernet.h"/>
     <structure type="stellarisEthernetDevice" label="dev">
       <attribute name="name" type="string" regexp="eth0" defaultValue="eth0"
         unique="deviceDriver"/>
       <attribute name="netmask" defaultValue="255.255.255.0" type="string"
         regexp="..."/>
       <attribute name="defaultGw" defaultValue="0.0.0.0" type="string"
         regexp="..."/>
       <attribute name="host" type="string" regexp="..."/>
     </structure>
     <task name="rtxmitTask">
       <entrypoint value="stellarisEthernet rtxmitTask"/>
     </task>
     <task name="txTask">
       <entrypoint value="stellarisEthernet EntryPoint"/>
     </task>
     <event name="ethernetEvent"/>
     <event name="timerEvent"/>
     <lib name="driver"/>
     <initFunctionDevice name="stellarisEthernet initDevice"/>
   </deviceDriver></platform>
```

# **Interacting Entities**



- Entities:
  - Active Entities (Tasks)
  - Passive Entities (Hubs)
- Interactions:
  - Service Requests from a Task to a Hub;
  - Represented by *packet exchanges*, not function calls!
  - Have the following interaction semantics:
    - \_W: waiting / blocking
    - \_NW: non waiting
    - \_WT: waiting with timeout
    - \_A: asynchronous
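
The first three semantics have close analogues in POSIX semaphores, which may help readers who know that API. The following is a minimal sketch under that analogy, not OpenComRTOS code: the `Status` type and the `sema_test_*` names are hypothetical, and it assumes a POSIX host such as Linux. The asynchronous `_A` variant has no direct POSIX counterpart and is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical status codes for this sketch only. */
typedef enum { RC_OK, RC_FAIL, RC_TIMEOUT } Status;

/* _W analogue: block until the semaphore can be taken. */
Status sema_test_w(sem_t *s) {
    while (sem_wait(s) != 0) {
        if (errno != EINTR) return RC_FAIL;   /* retry only if interrupted */
    }
    return RC_OK;
}

/* _NW analogue: return immediately, whether or not it succeeded. */
Status sema_test_nw(sem_t *s) {
    return sem_trywait(s) == 0 ? RC_OK : RC_FAIL;
}

/* _WT analogue: block, but give up after timeout_ms milliseconds. */
Status sema_test_wt(sem_t *s, long timeout_ms) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);  /* sem_timedwait takes an absolute time */
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    while (sem_timedwait(s, &deadline) != 0) {
        if (errno == ETIMEDOUT) return RC_TIMEOUT;
        if (errno != EINTR) return RC_FAIL;
    }
    return RC_OK;
}
```

With a semaphore initialised to 1, a first `sema_test_nw()` succeeds, a second one fails at once, and `sema_test_wt()` reports a timeout instead of blocking forever.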

# **Available Passive Entities (Hubs)**



- Port: Data exchange between Tasks
- Event: Boolean signal
- Semaphore: Counting Event
- Resource: Mutual Exclusion (Mutex / Lock)
  - Provides distributed Priority Inheritance.
- FIFO: Buffered data exchange between Tasks
- Memory Pool: Dynamic allocation of memory-blocks.
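
To illustrate what priority inheritance on a Resource means, here is a small single-node model. It is a sketch only: the `Task` and `Resource` types and both functions are hypothetical, lower numbers are assumed to mean higher priority, and the distributed part (propagating the boost across Nodes) is not modelled.

```c
#include <stddef.h>

/* Hypothetical model types; lower value = higher priority (assumption). */
typedef struct {
    int base_prio;  /* priority assigned at design time */
    int eff_prio;   /* priority currently in effect */
} Task;

typedef struct {
    Task *owner;    /* NULL when the Resource is free */
} Resource;

/* Returns 1 if the task now owns the Resource, 0 if it would have to wait.
 * A waiting task with higher priority lends its priority to the owner. */
int resource_lock(Resource *r, Task *t) {
    if (r->owner == NULL) {
        r->owner = t;
        return 1;
    }
    if (t->eff_prio < r->owner->eff_prio) {
        r->owner->eff_prio = t->eff_prio;  /* inherit the waiter's priority */
    }
    return 0;
}

/* Releasing the Resource drops the owner back to its base priority. */
void resource_unlock(Resource *r) {
    if (r->owner != NULL) {
        r->owner->eff_prio = r->owner->base_prio;
        r->owner = NULL;
    }
}
```

The boost keeps a medium-priority task from preempting the owner while a high-priority task is waiting for the Resource.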

#### **Generic Hub Model**





# **Virtual Single Processor**



Separates two areas of concern:

- Hardware Configuration (Topology View)
- Application Configuration (Application View)

Benefits:

- Transparent parallel programming
- System wide priority management

# **Virtual Single Processor II**



Topology View consists of:

- Nodes (CPU-Cores)
- Links (prioritized packet communication between Nodes)



# Virtual Single Processor III



Application View consists of the following entities:

- Tasks
- Hubs
- Interactions (OpenComRTOS routes them to their destination Entity)
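
The routing idea can be sketched as a lookup in a Hub-to-Node mapping. This is an illustrative model, not the OpenComRTOS implementation: `HubMapping`, `Route`, and `route_request()` are hypothetical names, and real routing over multi-hop topologies is not shown.

```c
#include <stddef.h>

/* Hypothetical result of the routing decision. */
typedef enum { DELIVER_LOCAL, FORWARD_OVER_LINK } Route;

/* One entry per Hub: which Node it is mapped to in the Topology View. */
typedef struct {
    const char *hub_name;
    int node_id;
} HubMapping;

/* Decide how a service request issued on task_node reaches the given Hub.
 * The Task's code is identical in both cases; only the mapping differs. */
Route route_request(const HubMapping *map, size_t hub, int task_node) {
    return map[hub].node_id == task_node ? DELIVER_LOCAL : FORWARD_OVER_LINK;
}
```

Moving a Hub to another Node changes one table entry, while the Task that issues the request stays untouched. That is the separation of Application View and Topology View in miniature.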



# **Virtual Single Processor IV**



Topology Diagram Entities are represented by meta models (XML-based), which contain the information about the following:

- CPU-Core(s) (type, interconnect, compiler, ...)
- Devices and their Device Drivers
- Link-Ports
- File Templates for Node Entry Point (main()).
- Hierarchy information (SoC, board, rack, cluster)

This makes it easy to deal with complex SoCs such as the TMS320C6678 or the MPC8640D.

# **Open Tracer**



Visualizes: Context Switches, Hub Interactions, and Packet exchanges between Nodes.

[Screenshot: Open Event Tracer 3.5.3.2 with the project `C:/OpenComRTOS-Suite-1.4/Demos/03_SemaphoreLoop_Tracing_MP_TCPIP/Output/bin` loaded. The tree view lists two Nodes, Win32Node (with Task1 and Hub Sema1) and ArmNode (with Task2 and Hub Sema2), each with its kernel, idle, and driver Tasks and packet pools.]

# **Open System Inspector**



Allows inspecting and modifying the state of the system at runtime:

- Monitoring of the Hub state
- Peek and Poke of memory regions
- Starting and Stopping of Tasks.



# **Safe Virtual Machine**



- Goals:
  - CPU independent programming
  - Low memory needs (embedded!)
  - Mobile, dynamic code => "embedded apps"
  - Allows reuse of legacy binary code on any processor
  - Formal development approach (SVM is generated from description)
- Results:
  - Selected the ARM Thumb-1 instruction set as VM target
    - Widely used CPU
    - < 3 Kbytes of code for VM
    - Executes binary compiled code
    - Capable of native execution on ARM targets
  - VM enhanced with safety support (option):
    - Memory violations
    - Stack violations
    - Numerical exceptions

# **SVM System Composition**





#### **Demonstrations**



- Single Node Semaphore Loop
- Multi Node Semaphore Loop
- Open Tracer
- Protecting a Shared Resource
- Open System Inspector
- Safe Virtual Machine for C
- Interrupt Latency
- eWheel Controller Simulation

# Single Node Semaphore Loop



- Goal: Implementing a Semaphore Loop:
  - 1. Create a Topology with one Win32 Node;
  - 2. Create two Tasks;
  - 3. Create two Semaphore Hubs;
  - 4. Establish the Interactions between Tasks and Hubs;
  - 5. Compile the project;
  - 6. Execute the project.
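
For readers without the Designer at hand, the loop that these steps produce can be approximated on a POSIX host with two threads and two semaphores. This is a sketch of the pattern only: it uses pthreads and POSIX semaphores rather than OpenComRTOS Tasks and Hubs, and `run_semaphore_loop()` is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t sema1, sema2;  /* stand-ins for the two Semaphore Hubs */

/* Wait on a semaphore, retrying if interrupted by a signal. */
static void wait_sema(sem_t *s) {
    while (sem_wait(s) != 0 && errno == EINTR) { /* retry */ }
}

/* Second task: wait for Sema1, then signal Sema2. */
static void *task2(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        wait_sema(&sema1);
        sem_post(&sema2);
    }
    return NULL;
}

/* First task: signal Sema1, then wait for Sema2.
 * Returns the number of completed round trips, or -1 on setup failure. */
int run_semaphore_loop(int iterations) {
    pthread_t t2;
    int done = 0;
    if (sem_init(&sema1, 0, 0) != 0 || sem_init(&sema2, 0, 0) != 0) return -1;
    if (pthread_create(&t2, NULL, task2, &iterations) != 0) return -1;
    for (int i = 0; i < iterations; i++) {
        sem_post(&sema1);
        wait_sema(&sema2);
        done++;
    }
    pthread_join(t2, NULL);
    sem_destroy(&sema1);
    sem_destroy(&sema2);
    return done;
}
```

Calling `run_semaphore_loop(1000)` performs 1000 round trips; each round trip forces two context switches, which is why this pattern is commonly used to gauge task-switching cost.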

# Multi Node Semaphore Loop



- Goal: Execute the Semaphore Loop distributed over two Nodes:
  - 1. Extend the Topology by an additional Node:
  - 2. Add an ARM Node
  - 3. Add a connection between the ARM and Win32 nodes
  - 4. Map one Task and one Hub onto the new ARM Node
  - 5. Compile Project
  - 6. Flash ARM node and Execute

# **Properties of the ARM Node**



- Based on Luminary Micro LM3S6965.
- ARM-Cortex-M3 @ 50MHz
- 64kB RAM
- 256kB Flash
- Communicating either via:
  - RS232 @ 921600bps
  - 100Mbps Ethernet (TCP-IP)

# **Open Tracer**



Goal: Obtain a trace from the Semaphore Loop running on the ARM and Win32 Nodes:

- Add a Stdio-Host-Server to the Win32 Node.
- Write the contents of the ARM Node trace buffer onto the disk of the Win32 Node.
- Write the contents of the Win32 Node trace buffer onto the disk of the Win32 Node.
- Display the Trace using OpenTracer.

# **Open System Inspector**



- Goal: Investigate and influence the State of the System during runtime:
  - Starting from the 'Distributed Semaphore Loop' example
  - Add two OSI-Server components, one for each Node.
  - Add an OSI-Relay component to the Win32-Node.
  - Build and run
  - Start the Open System Inspector (OSI) and load the project.
  - Investigate the state of the system and influence it.

# **Protecting a Shared Resource**



Goal: share one Screen between an ARM Node and a Windows Node:

- Insert a Resource, which provides mutual exclusive access to the StdioHostServer.
- Claim the Resource using L1\_LockResource\_W() before accessing the StdioHostServer.
- Release the Resource by calling L1\_UnlockResource\_W()
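
The claim/release pattern can be tried out on a POSIX host with the sketch below. It is an analogy, not OpenComRTOS code: `lock_resource_w()` and `unlock_resource_w()` are hypothetical stand-ins for L1\_LockResource\_W() and L1\_UnlockResource\_W(), implemented here with a plain pthread mutex (no priority inheritance), and a character buffer stands in for the shared screen.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define MSG_LEN 4           /* characters per message */
#define MSGS_PER_TASK 1000  /* messages written by each task */

static pthread_mutex_t screen_resource = PTHREAD_MUTEX_INITIALIZER;
static char screen[2 * MSGS_PER_TASK * MSG_LEN];  /* the shared "screen" */
static size_t screen_pos;

/* Hypothetical stand-ins for the Resource lock/unlock services. */
static void lock_resource_w(void)   { pthread_mutex_lock(&screen_resource); }
static void unlock_resource_w(void) { pthread_mutex_unlock(&screen_resource); }

/* Each task writes whole messages while holding the Resource. */
static void *writer_task(void *arg) {
    char tag = *(char *)arg;
    for (int m = 0; m < MSGS_PER_TASK; m++) {
        lock_resource_w();
        for (int i = 0; i < MSG_LEN; i++) screen[screen_pos++] = tag;
        unlock_resource_w();
    }
    return NULL;
}

/* Returns 1 if every message arrived on the screen without interleaving. */
int run_shared_screen(void) {
    pthread_t a, w;
    char tag_a = 'A', tag_w = 'W';  /* 'A' for the ARM side, 'W' for Windows */
    screen_pos = 0;
    if (pthread_create(&a, NULL, writer_task, &tag_a) != 0) return 0;
    if (pthread_create(&w, NULL, writer_task, &tag_w) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(w, NULL);
    if (screen_pos != sizeof screen) return 0;
    for (size_t i = 0; i < screen_pos; i += MSG_LEN)
        for (size_t j = 1; j < MSG_LEN; j++)
            if (screen[i + j] != screen[i]) return 0;
    return 1;
}
```

Without the lock and unlock calls, the two writers race on `screen_pos` and messages can interleave or overwrite each other.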

# Safe Virtual Machine for C



Goal: Make Tasks loadable during runtime, and have a standard binary format for them (ARM Thumb-1)

- Starting from the 'Single Node Semaphore Loop' example
- Add an SVM node to the Topology Diagram
- Add an SVM-Component to the Application Diagram and map it to the Win32-Node; this is the VM.
- Map one of the tasks to the Node called 'svm'. It will then be compiled into an ARM Thumb-1 binary.
- Modify a native task to load the binary image (Taskname.bin), and then start the VM.

## **Interrupt Latency**



This demo measures two separate latencies using the Timer IRQ:

- IRQ to ISR: how long it takes from the occurrence of an IRQ until the first useful statement in the ISR is executed.
- IRQ to Task: how long it takes from the occurrence of an IRQ until the first useful statement in the Task handling this IRQ is executed.

# eWheel Controller Simulation



[Screenshot: eWheel Demo 1.0.1.6, with displays labelled Left, Right, and SPEED (km/h).]

This demonstration simulates a Segway-type wheel and consists of the following parts:

- eWheel Visualisation
- eWheel Controller
- Physical Model



## Performance



- Code-size Figures
- Task switching Figures
- Interrupt Latency

#### **OCR Code-size Figures**



| CPU Type      | Code size   |
|---------------|-------------|
| ARM-Cortex-M3 | 2.5 – 4.0kB |
| XMOS-XS1      | 5.0 – 7.5kB |
| PowerPC e600  | 7.1 – 9.8kB |
| TI-C66x (DSP) | 5.1 – 7.7kB |

Code size depends on the application; the system automatically removes unused services.

# **Task Switching Figures**





| CPU Type      | Memory           | Loop Time   |
|---------------|------------------|-------------|
| ARM-Cortex-M3 | internal         | 2360 cycles |
| XMOS-XS1      | internal         | 2130 cycles |
| PowerPC e600  | Simulator (psim) | 1638 cycles |
| TI C66x (DSP) | L2-SRAM          | 4470 cycles |
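
As a reading aid, a cycle count $N$ converts to time as $t = N / f$. Assuming the ARM-Cortex-M3 figure was measured on the 50MHz LM3S6965 described earlier (the table itself does not state the clock), the loop time would be roughly:

$$
t = \frac{N}{f} = \frac{2360\ \text{cycles}}{50 \times 10^{6}\ \text{Hz}} = 47.2\ \mu\text{s}
$$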

# **Interrupt Latency Measurement**





- IRQ 2 ISR: The time that elapsed between the IRQ and the first useful instruction of the ISR.
- IRQ 2 Task: The time that elapsed between the IRQ and the first useful instruction of a Task triggered by the ISR.
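
Such measurements are usually summarised as a range plus a typical value. The sketch below shows one way to reduce a set of latency samples to minimum, maximum, and median, assuming the "50%" figures in this talk denote the median. `LatencyStats` and `latency_stats()` are hypothetical names, and the sample values in the usage note are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical summary of a latency measurement run. */
typedef struct {
    unsigned min;
    unsigned max;
    unsigned median;  /* upper median for an even number of samples */
} LatencyStats;

/* Three-way comparison for qsort, safe against unsigned wrap-around. */
static int cmp_unsigned(const void *a, const void *b) {
    unsigned x = *(const unsigned *)a;
    unsigned y = *(const unsigned *)b;
    return (x > y) - (x < y);
}

/* Sorts the samples in place and summarises them; requires n > 0. */
LatencyStats latency_stats(unsigned *samples, size_t n) {
    LatencyStats s;
    qsort(samples, n, sizeof samples[0], cmp_unsigned);
    s.min = samples[0];
    s.max = samples[n - 1];
    s.median = samples[n / 2];
    return s;
}
```

For the made-up samples `{30, 10, 50, 20, 40}`, this yields a minimum of 10, a maximum of 50, and a median of 30.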

29 August 2011

# **Interrupt Latency Figures**



| CPU Type      | Memory    | IRQ 2 ISR           | IRQ 2 Task             |
|---------------|-----------|---------------------|------------------------|
| ARM-Cortex-M3 | internal  | 15 – 81 (50%: 20)   | 600 – 1200 (50%: 800)  |
| XMOS-XS1      | internal  | 73 – 142 (50%: 88)  | 600 – 1100 (50%: 700)  |
| PowerPC e600  | Simulator | 70                  | 896                    |
| TI C66x (DSP) | L2-SRAM   | 136                 | 1367                   |


# IRQ 2 ISR on XMOS 100MHz



[Screenshot: Latency Demo 1.7.1.9 (XMOS XS1 100MHz, OpenComRTOS 1.4), with histogram panels for IRQ to Task Latency and IRQ to ISR Latency, each offering Lin, Log, Reset, and Profile controls; the visible readout ends in "minimal: 6 usec".]


# IRQ 2 Task on XMOS 100MHz





# Conclusions



- OpenComRTOS Designer allows you to master the complexity of distributed heterogeneous systems.
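- OpenComRTOS has a small memory footprint: 2.5 to 9.8kB of code, depending on target and application.
- OpenComRTOS performs well: task-switching loop times range from 1638 to 4470 cycles on the measured targets.
- Trace information from embedded targets can be obtained without using expensive instrumentation.
- Open System Inspector allows inspecting and modifying a running system.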



# **Questions?**

### **Thank You for your attention**



*"If it doesn't work, it must be art. If it does, it was real engineering"*